In addition to normal IT best practices and redundant hardware, the database availability group (DAG) is the primary high-availability option for Exchange 2010 Mailbox servers. A DAG is a collection of servers that provides continuous replication and availability for mailbox databases, as shown in Figure 1.
Continuous replication creates a passive database copy on another Mailbox server in the DAG, and then uses asynchronous log shipping to maintain the copies.
The continuous replication process follows these steps:
1. The active transaction log is written and then closed.
2. The Microsoft Exchange Replication service replicates the closed log to the servers hosting the passive database copies.
3. To keep every copy of the database identical, the Log Inspector examines each replicated transaction log and:
   - Verifies the physical integrity of the transaction log
   - Verifies that the header generation is not higher than the highest generation for the current database copy
   - Verifies that the log header matches the generation in the file name
   - Verifies that the log file signature in the header matches the log file
4. The transaction log is then placed in the defined transaction log directory.
5. The Information Store validates the transaction log and applies it to the database copy, so the databases remain in sync.
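The four Log Inspector checks listed above can be pictured with a small Python model. This is only a sketch under stated assumptions: the `LogFile` class, its fields, the CRC32 stand-in for physical integrity, and the file naming scheme (a prefix such as `E00` followed by a hexadecimal generation number) are illustrative choices, not the real log format or any Exchange API.

```python
import zlib
from dataclasses import dataclass


@dataclass
class LogFile:
    """Toy stand-in for a shipped transaction log (not the real on-disk format)."""
    file_name: str          # assumed form: prefix + hex generation + ".log"
    header_generation: int  # generation number recorded in the log header
    header_signature: str   # signature recorded in the log header
    payload: bytes          # log records
    checksum: int           # CRC32 of payload, standing in for physical integrity


def inspect_log(log, highest_generation, expected_signature, prefix="E00"):
    """Run the four checks in order; return the names of the checks that failed."""
    failures = []

    # 1. Physical integrity of the transaction log (modeled as a CRC32 match).
    if zlib.crc32(log.payload) != log.checksum:
        failures.append("physical integrity")

    # 2. Header generation must not exceed the highest generation
    #    the caller supplies for the current database copy.
    if log.header_generation > highest_generation:
        failures.append("generation too high")

    # 3. Header generation must match the generation encoded in the file name.
    stem = log.file_name.rsplit(".", 1)[0]
    try:
        name_generation = int(stem[len(prefix):], 16)
    except ValueError:
        name_generation = None
    if not stem.startswith(prefix) or name_generation != log.header_generation:
        failures.append("file name mismatch")

    # 4. Signature in the header must match the one expected for this log stream.
    if log.header_signature != expected_signature:
        failures.append("signature mismatch")

    return failures
```

An empty list means the log would move on to the transaction log directory in this model; any entry corresponds to a failed inspection.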
A DAG also has the following characteristics:
Requires the Windows failover clustering feature and an Enterprise edition of Windows Server (Windows Server 2008 or Windows Server 2008 R2), although installation and configuration are performed with the Exchange Server management tools. Exchange Server does not use Windows failover clustering to handle database failover; instead, Active Manager manages the failover process.
Members must have the same operating system.
You can add up to 16 servers to a single DAG and create up to 16 copies of a database. Each server in the DAG can host up to 100 databases, counting both active and passive copies.
Uses an evolution of the continuous replication technology that is available in Exchange 2007.
A DAG can be created after you install the Mailbox server. A Mailbox server that already hosts active mailbox databases can be added to a DAG later if it meets the requirements.
Allows you to move a single database between servers in the DAG without affecting other databases. Failover occurs per mailbox database, not for an entire server.
Allows up to 16 copies of a single database on separate servers. A server can host only one copy of each database.
Requires the database and transaction log copies for each database to be stored in the same path on all servers. For example, if you store Mailbox Database 1 in D:\DB\Mailbox Database 1\ on Dallas-MB01A, you must also store it in D:\DB\Mailbox Database 1\ on all other servers that host copies of Mailbox Database 1.
Defines the boundary for replication, failovers, and switchovers: only servers in the DAG can host database copies. You cannot replicate database copies to Mailbox servers that are not in the same DAG.
Does not require that all databases have the same number of copies. In a 16-node DAG, one database can have 16 copies, whereas other databases either are not redundant or have a varying number of copies.
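The limits in the list above (16 members, 16 copies per database, one copy per server, 100 databases per server, identical paths, copies only on DAG members) lend themselves to a quick planning check. The following Python sketch validates a proposed layout against those rules. The function name, the tuple layout, and any server name other than Dallas-MB01A are hypothetical, and the check runs on a plan you type in rather than querying Exchange.

```python
MAX_DAG_MEMBERS = 16
MAX_COPIES_PER_DATABASE = 16
MAX_DATABASES_PER_SERVER = 100


def validate_dag_layout(members, copies):
    """Check a planned layout against the DAG rules described in the text.

    members: list of server names in the DAG
    copies:  list of (database, server, path) tuples, one per database copy
    Returns a list of human-readable problems (empty if the plan is valid).
    """
    problems = []
    if len(members) > MAX_DAG_MEMBERS:
        problems.append(f"DAG has {len(members)} members (limit {MAX_DAG_MEMBERS})")

    per_database = {}
    per_server = {}
    for database, server, path in copies:
        # Only servers in the DAG can host database copies.
        if server not in members:
            problems.append(f"{database}: {server} is not a DAG member")
        per_database.setdefault(database, []).append((server, path))
        per_server[server] = per_server.get(server, 0) + 1

    for database, placements in per_database.items():
        servers = [server for server, _ in placements]
        # A server can host only one copy of each database.
        if len(servers) != len(set(servers)):
            problems.append(f"{database}: more than one copy on the same server")
        if len(placements) > MAX_COPIES_PER_DATABASE:
            problems.append(f"{database}: more than {MAX_COPIES_PER_DATABASE} copies")
        # Every copy must use the same path (Windows paths compare case-insensitively).
        paths = {path.rstrip("\\").lower() for _, path in placements}
        if len(paths) > 1:
            problems.append(f"{database}: copies use different paths")

    for server, count in per_server.items():
        if count > MAX_DATABASES_PER_SERVER:
            problems.append(f"{server}: hosts {count} databases (limit {MAX_DATABASES_PER_SERVER})")

    return problems
```

Note that the sketch deliberately does not require every database to have the same number of copies, which matches the last characteristic in the list.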
In Exchange 2010, transaction log shipping occurs over TCP sockets rather than over the file share (Server Message Block) used in Exchange 2007. The default TCP port used for replication is 64327. You can view the port currently in use by running Get-DatabaseAvailabilityGroup -Status | Format-List, and you can change it with the -ReplicationPort parameter of the Set-DatabaseAvailabilityGroup cmdlet. For the change to take effect, you need to create Windows Firewall exceptions for the new TCP port and then restart the Microsoft Exchange Replication service on each node in the DAG.
In the initial release of Exchange 2010, a DAG created using the EMC was automatically configured to obtain an IP address from DHCP; to complete the configuration and assign a static IP address, you had to use the EMS. In SP1, the DAG can be configured with an IP address from within the EMC.
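After changing the replication port and adding firewall exceptions, it can be useful to confirm from another machine that something is accepting TCP connections on that port. The Python sketch below does only that. The function name is hypothetical, the default of 64327 comes from the text above, and a successful connection shows that a listener is reachable, not that it is the Microsoft Exchange Replication service.

```python
import socket


def replication_port_reachable(host, port=64327, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

You would call it once per DAG member, passing each member's host name and the port reported by Get-DatabaseAvailabilityGroup -Status.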
The target member notifies the member running the active copy of which transaction logs it expects to receive, and the source member responds by sending the required transaction log files. After the transaction logs are received from the source server, they are placed in the target server's Inspector directory for processing, where each log is verified for integrity and its header is inspected. After passing inspection, a transaction log is placed in the log directory on the target Mailbox server. If a transaction log does not pass inspection, the target server requests it from the source again, up to three times, before setting the mailbox database copy to Failed. While a database copy is in the Failed state, the target server periodically attempts to copy the missing log files in order to return the copy to a Healthy state. The target Exchange server then plays the inspected logs against its local copy of the database.
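The inspect-and-retry behavior described above can be sketched as a small loop. This Python model assumes one reading of the text: the initial copy of a log plus up to three further requests, after which the copy is marked Failed. The function name and the two callables are hypothetical, and the periodic recovery attempts made while a copy is Failed are not modeled.

```python
def ship_log(fetch_log, passes_inspection, max_rerequests=3):
    """Fetch one transaction log until it passes inspection or retries run out.

    fetch_log:         callable that asks the source for the log and returns it
    passes_inspection: callable that returns True if the log is acceptable
    Returns (status, attempts), where status is "Healthy" or "Failed".
    """
    total_attempts = 1 + max_rerequests  # initial copy plus the re-requests
    for attempt in range(1, total_attempts + 1):
        log = fetch_log()
        if passes_inspection(log):
            # The log would now move from the Inspector directory to the log directory.
            return "Healthy", attempt
    # Every attempt failed inspection, so the database copy is set to Failed.
    return "Failed", total_attempts
```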
Before this transaction log shipping process can start, the database copy must first be seeded. Seeding
is the process of creating a consistent database copy on a DAG member
to act as a baseline that will be updated through continuous replication of the transaction log files. This can be accomplished using the following methods:
Automatic seeding: Automatic seeding occurs during the creation of a new database.
Manually copying the offline database: This method involves dismounting the database and copying the database file to the target server. Service is interrupted while the database is dismounted.
Using the Update-MailboxDatabaseCopy cmdlet: You can use the Update-MailboxDatabaseCopy cmdlet in the EMS to seed a database copy.
Using the Update Database Copy Wizard: You can use the Update Database Copy Wizard within the EMC to seed a database copy.
Database failover occurs when the active database fails and another copy of the database is activated on another server in the DAG. This can result from a number of failure types, including network, storage, and server hardware failures. If an entire DAG member fails, each of its active highly available databases attempts to fail over to another configured DAG member. A switchover occurs when an administrator initiates moving an active database from one server to another.
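To illustrate that failover happens per database rather than per server, the Python sketch below works out where each database ends up when one DAG member fails. It is a deliberately simplified model: the data layout and function name are hypothetical, and picking the first surviving copy in list order is an assumption made for the example, since the text does not describe how Active Manager chooses a copy.

```python
def fail_over(databases, failed_server):
    """Return the server hosting the active copy of each database after a failure.

    databases: dict mapping database name to
               {"active": server, "copies": [servers holding a copy, in order]}
    A value of None means no other copy exists, so the database stays offline.
    """
    result = {}
    for name, database in databases.items():
        if database["active"] != failed_server:
            # Databases active on other members are not affected.
            result[name] = database["active"]
            continue
        # Simplification: activate the first copy that is not on the failed server.
        survivors = [s for s in database["copies"] if s != failed_server]
        result[name] = survivors[0] if survivors else None
    return result
```

A switchover could be modeled the same way, with the administrator supplying the target server instead of the function choosing one.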
Colin Lee
Technology Specialist, Unified Communications, Microsoft Corporation, Australia
In my opinion, Exchange 2007
is an evolutionary step in providing a complete high-availability
solution with continuous replication. This provides capability for high
availability, with CCR, and disaster recovery (DR), with SCR. Many
customers I have worked with implemented this solution for high
availability and DR with great success and were able to improve their
SLA, or internal operational level agreement. As with all new
technology there are areas for improvement and Microsoft continues to
evolve continuous replication with Database Availability Group (DAG) in
Exchange 2010. The introduction of DAGs in Exchange 2010 adds improvements that my customers requested as they looked to improve SLAs even further. These requests often concern the active-passive nature of CCR and the ability to fail over seamlessly if the disk (or RAID group) on which the database resides fails.
Note:
In a CCR implementation with multiple storage groups, an outage of a disk did not trigger a failover between the nodes and required manual intervention to initiate a recovery, whether that was a restore from backup for a database or triggering a node failover.
Exchange 2010 solves this issue by making the database the unit of failover. It also helps address the perception that a passive node sits idle, because up to 16 members can be put in a DAG and all members can host active mailboxes. That perception carries weight, because upper management tends to view "idle" servers as inefficiencies a company can do without. The following comments are
from a customer that has migrated from Exchange 2007 with a CCR and SCR
implementation to Exchange 2010 with a DAG that spans multiple
datacenters.
"Moving to Exchange 2010 has
allowed us to provide a more highly available solution to our hotels
department whilst at the same time giving us (IT) increased simplicity
in managing the infrastructure. We have extremely high confidence in
our DAG with its ability for single database failovers as opposed to
our old CCR and SCR setup. Implementing our DAG together with
Datacentre Activation Coordination mode has also given us the
confidence to increase our Disaster Recovery scope from a single
storage group of critical mailboxes to the entire group, yet at the
same time maintaining an uncomplicated recovery process."